Question 6 


Intent of Question 


This question was designed to evaluate a student’s ability to make inferences for simple linear regression models. 
Interpreting model parameters and comparing and contrasting different models are important skills that are also 
being assessed. Finally, a multiple regression model with a special variable, an indicator variable, is introduced to 
investigate whether the relationship between the predictor and response variable differs for two different groups 
of people. Students are asked to sketch the estimated line for both groups and interpret the estimated parameters in 
the multiple regression model. 


Solution 
Part (a): 


The value 1.080 estimates the average increase (in feet) in the perceived distance for each additional foot in 
actual distance between the two objects. 


Part (b): 


The model with zero intercept makes more intuitive sense in this particular situation. If the two objects are 
placed side by side (so the actual distance is zero), then we would expect the subjects to say that the distance 
between the objects is zero. 
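
The argument in part (b) can be checked numerically. The following is a minimal sketch that evaluates the two fitted equations given in the question; the function names `model1` and `model2` are illustrative and do not come from the exam.

```python
def model1(actual_ft):
    """Model 1: fitted line with an intercept (coefficients from the question)."""
    return 0.238 + 1.080 * actual_ft


def model2(actual_ft):
    """Model 2: fitted line forced through the origin."""
    return 1.102 * actual_ft


# Compare the two fitted lines at a few actual distances (in feet).
for d in (0, 1, 5, 10):
    print(d, round(model1(d), 3), round(model2(d), 3))
```

At an actual distance of zero, Model 1 still predicts a perceived distance of 0.238 feet, while Model 2 predicts zero, which is the intuition the solution appeals to.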


Part (c): 


Let β denote the true slope between the perceived distances and the actual distances. The researcher’s 
hypothesis is equivalent to β > 1. Thus, we want to conduct a hypothesis test for the slope parameter. 


Step 1: States a correct pair of hypotheses: 


H₀: β = 1 
Hₐ: β > 1 

Step 2: Correct mechanics, including the value of the test statistic and p-value (or rejection region). 
This is a t-test of a slope. 


t = (b − 1) / s_b = (1.102 − 1) / 0.393 = 0.260 
df = 40 − 1 = 39 
p-value = P(t > 0.260) = 0.398 


Step 3: States a correct conclusion in the context of the problem, using the result of the statistical test. 


Since the p-value 0.398 is greater than 0.05, we cannot reject H₀. That is, we do not have statistically 
significant evidence to conclude that the subjects overestimate the distance with the magnitude of the 
overestimation increasing as the actual distance increases. 
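
The mechanics of the slope test in part (c) can be reproduced with a short script. This is a sketch only: it uses the Model 2 estimate and standard error from the question, and it approximates the t-distribution tail area by numerical integration (Simpson's rule) so that nothing beyond the standard library is needed. The helper names `t_pdf` and `upper_tail` are illustrative.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))


def upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


b, se_b, n = 1.102, 0.393, 40   # Model 2 slope, its standard error, sample size
t_stat = (b - 1) / se_b         # test H0: beta = 1 against Ha: beta > 1
df = n - 1                      # no-intercept model, so one estimated parameter
p_value = upper_tail(t_stat, df)
print(round(t_stat, 3), df, round(p_value, 3))
```

The key design point is that the null value subtracted in the numerator is 1, not 0, because the researcher's claim concerns overestimation that grows with distance.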


Part (d): 


According to Model 3, the estimated models for the two groups are: 


Contact wearers (contact = 1): 
perceived distance = 1.05 (actual distance) + 0.12 (actual distance) 
= 1.17 (actual distance) 


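Noncontact wearers (contact = 0): 
perceived distance = 1.05 (actual distance) 


[Graph: the two estimated regression lines on a grid, with the noncontact line labeled “No contacts.” Horizontal axis: Actual Distance (feet), 1 to 10. Vertical axis: Perceived Distance (feet).] 

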
Part (e): 


Model 3 allows prediction of perceived distance separately for contact wearers and for noncontact wearers. 
The value of 1.05 estimates the average increase (in feet) in the perceived distance for each one-foot increase 
in actual distance for the population of noncontact wearers. The value of 0.12 estimates the additional 
increase (in feet) in the average perceived distance for each one-foot increase in actual distance for the contact 
wearers. 
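
The effect of the indicator variable in parts (d) and (e) can be illustrated with a small sketch. The coefficients are those given for Model 3; the function names `model3` and `slope` are illustrative only.

```python
def model3(actual_ft, contact):
    """Model 3 prediction; contact is 1 for contact wearers, 0 otherwise."""
    return 1.05 * actual_ft + 0.12 * contact * actual_ft


def slope(contact):
    """Slope of the estimated line for the chosen group."""
    return 1.05 + 0.12 * contact


# Points for sketching both lines on the grid (actual distance 0 to 10 feet).
for d in (0, 5, 10):
    print(d, round(model3(d, 0), 2), round(model3(d, 1), 2))
```

Because the indicator multiplies the actual distance, it changes the slope rather than the intercept, so both lines pass through the origin and spread apart as the actual distance grows.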


Scoring 


Parts (a) and (b) are combined and scored as essentially correct (E), partially correct (P), or incorrect (I). Parts 
(c), (d), and (e) are scored as essentially correct (E), partially correct (P), or incorrect (I). 


Parts (a) and (b) combined is scored as essentially correct (E) if both parts are correct. 


Parts (a) and (b) combined is scored as partially correct (P) if: 
one part is correct and the other part is incorrect; 
OR 
one part is correct and the other part is partially correct; 
OR 
both parts are partially correct. 


Parts (a) and (b) combined is scored as incorrect (I) if one part is partially correct and the other part is incorrect, or if both parts are incorrect. 
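
The combination rule for parts (a) and (b) can be written as a small lookup. This is a sketch under the assumption that each part is first rated on its own as correct ("C"), partially correct ("P"), or incorrect ("I"); those single-letter codes and the function name `combine_ab` are illustrative, not part of the guidelines.

```python
def combine_ab(part_a, part_b):
    """Combine the ratings of parts (a) and (b) into one E/P/I score."""
    ratings = sorted([part_a, part_b])  # alphabetical order: 'C' < 'I' < 'P'
    if ratings == ["C", "C"]:
        return "E"                      # both parts correct
    if "C" in ratings or ratings == ["P", "P"]:
        return "P"                      # one correct, or both partially correct
    return "I"                          # at most one partially correct part


print(combine_ab("C", "P"), combine_ab("P", "I"))
```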


Notes: 


Part (a) is scored as partially correct if there is no word that makes it clear that 1.080 is not a deterministic 
increase. 


Part (a) is scored as incorrect if the response: 
• ignores the intercept and implies proportionality: for each foot of actual distance between the two 
objects, the subject perceives about 1.080 feet; 
• consists of the equation rewritten in words. 


Part (b) 


Additional correct statement: 
• The intercept is clearly not statistically significant, so the simpler model that includes only the 
slope is reasonable. 


Partially correct statements: 
• The SE for Model 2 is so large that Model 2 does not seem reasonable. 
• The interpretation of the slope is straightforward if there is a 0 intercept: the percentage error is 
slope − 1 or 10.2 percent. 
• The slope for Model 2 is farther above 1 than the slope for Model 1 and so more in line with the 
researcher’s hypothesis. 


Incorrect statements: 
• Having one SE is better than having two. 
• It is simpler/easier/shorter/more accurate to have just one coefficient. 
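
The additional correct statement for part (b), that the Model 1 intercept is not statistically significant, can be checked with a quick calculation. This sketch assumes a two-sided test at the 5% level with 40 − 2 = 38 degrees of freedom for Model 1, for which the critical value is about 2.024; the variable names are illustrative.

```python
intercept, se_intercept = 0.238, 0.260   # Model 1 intercept and its standard error
t_intercept = intercept / se_intercept   # test statistic for H0: intercept = 0

T_CRIT = 2.024                           # assumed two-sided 5% critical value, df = 38
print(round(t_intercept, 3), abs(t_intercept) > T_CRIT)
```

The estimated intercept is less than one standard error from zero, so the test statistic falls well short of the critical value.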
Part (c) is scored as: 
Essentially correct (E) if three steps are correct. 
Partially correct (P) if two steps are correct. 
Incorrect (I) if one step is correct. 


Notes: 
• Hypotheses: the hypotheses step is incorrect if the alternative hypothesis is two-sided, or if the null 
hypothesis is β = 0. (It is not necessary to define β.) 
• Computation: if the computation includes division by √40, the computation step is incorrect. 
• Conclusion: a conclusion with no context is incorrect. 
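
The computation note can be made concrete. The value 0.393 reported for Model 2 is already the standard error of the slope, so dividing it again by √40 inflates the test statistic. The sketch below contrasts the two calculations; the variable names are illustrative.

```python
import math

b, se_b, n = 1.102, 0.393, 40

t_correct = (b - 1) / se_b                  # standard error used as reported
t_wrong = (b - 1) / (se_b / math.sqrt(n))   # erroneous extra division by sqrt(40)

print(round(t_correct, 4), round(t_wrong, 4))
```

The erroneous statistic is larger than the correct one by a factor of √40, roughly 6.3, which would make the evidence look far stronger than it is.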


Part (d) is scored as essentially correct (E) if both estimated regression lines are graphed correctly and at least 
one is labeled. 


Part (d) is scored as partially correct (P) if: 
• the lines are graphed correctly but neither is labeled; 
OR 
• the graphs consist of unconnected dots. 


Part (d) is scored as incorrect (I) if: 
• the two lines on the grid have the same slope; 
OR 
• one line is plotted correctly and one line is not. 


Part (e) is scored as essentially correct (E) if the response includes a correct interpretation of the estimated 
coefficients, 1.05 and 0.12. Unlike in part (a) there is no y-intercept, so this statement is correct: “For each foot of 
actual distance between the two objects, a noncontact wearer perceives about 1.05 feet, and a contact wearer will 
perceive about an additional 0.12 feet.” 


Part (e) is scored as partially correct (P) if: 
• the response includes a correct interpretation of just one of the two coefficients; 
OR 
• the response includes a correct interpretation of 1.05 and 1.05 + 0.12 = 1.17 but doesn’t include a 
separate interpretation of 0.12; 
OR 
• no numbers are mentioned, but it is made clear that both groups overestimate the distance AND that 
contact wearers overestimate more than do noncontact wearers. 


Part (e) is scored as incorrect (I) if: 
• the response says only that 1.05 and 0.12 are “slopes of regression lines”; 
OR 
• only the SEs of the coefficients, 0.357 and 0.032, are interpreted. 


Each essentially correct (E) response counts as 1 point; each partially correct (P) response counts as ½ point. 


4 Complete Response 
3 Substantial Response 
2 Developing Response 
1 Minimal Response 


If a response is between two scores (for example, 2½ points), use a holistic approach to determine whether 
to score up or down depending on the strength of the response and communication. 
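
The point tally can be sketched as follows, assuming the four scored sections are the combined parts (a) and (b), part (c), part (d), and part (e). The names `POINTS`, `total_points`, and `needs_holistic_call` are illustrative. The holistic decision for half-point totals is a reader judgment and is deliberately not automated here.

```python
POINTS = {"E": 1.0, "P": 0.5, "I": 0.0}


def total_points(ratings):
    """Sum the points for the four scored sections."""
    return sum(POINTS[r] for r in ratings)


def needs_holistic_call(ratings):
    """True when the total falls between two whole-number scores."""
    return total_points(ratings) % 1 != 0


example = ["E", "P", "E", "E"]   # (a)+(b), (c), (d), (e)
print(total_points(example), needs_holistic_call(example))
```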
STATISTICS 
SECTION II 
Part B 
Question 6 
Spend about 25 minutes on this part of the exam. 
Percent of Section II grade—25 


Directions: Show all your work. Indicate clearly the methods you use, because you will be graded on the 
correctness of your methods as well as on the accuracy and completeness of your results and explanations. 


6. A study was designed to explore subjects’ ability to judge the distance between two objects placed in a dimly lit 
room. The researcher suspected that the subjects would generally overestimate the distance between the objects 
in the room and that this overestimation would increase the farther apart the objects were. 


The two objects were placed at random locations in the room before a subject estimated the distance (in feet) 
between those two objects. After each subject estimated the distance, the locations of the objects were 
rerandomized before the next subject viewed the room. 


After data were collected for 40 subjects, two linear models were fit in an attempt to describe the relationship 
between the subjects’ perceived distances (y) and the actual distance, in feet, between the two objects. 


Model 1: ŷ = 0.238 + 1.080 × (actual distance) 
The standard errors of the estimated coefficients for Model 1 are 0.260 and 0.118, respectively. 
Model 2: ŷ = 1.102 × (actual distance) 
The standard error of the estimated coefficient for Model 2 is 0.393. 


(a) Provide an interpretation in context for the estimated slope in Model 1. 


For every one foot farther away the two objects actually 
are, our best estimate is that the perceived distance 
will increase by 1.08 feet on average. 
6A2 
(b) Explain why the researcher might prefer Model 2 to Model 1 in this context. 
The researcher may think that the true relationship 
is directly linear, and that if the objects were in the 
same place they would not be perceived as 0.238 feet 
apart or anything near that large. 


(c) Using Model 2, test the researcher’s hypothesis that in dim light participants overestimate the distance, with 
the overestimate increasing as the actual distance increases. (Assume appropriate conditions for inference are met.) 

H₀: β = 1 
Hₐ: β > 1 
where β is the true slope of the linear relationship in Model 2. 

Assume the sample data were independent (subjects randomized), 
the true relationship is linear, there is a constant standard deviation 
for any actual distance, and it is normally distributed [illegible]. 

t-test for slope of regression line 

t = (1.102 − 1) / 0.393 = 0.2595 
df = 38 
P(t > 0.2595) = 0.398 

There is virtually no evidence that the researcher’s 
hypothesis is correct, because if subjects were 
in fact unbiased perceivers of the distance, results 
indicating at least this much overestimating would 
occur near 40% of the time in any case. 
The researchers also wanted to explore whether the performance on this task differed between subjects who wear 
contact lenses and subjects who do not wear contact lenses. A new variable was created to indicate whether or 
not a subject wears contact lenses. The data for this variable were coded numerically (1 = contact wearer, 
0 = noncontact wearer), and this new variable, named “contact,” was included in the following model. 


Model 3: ŷ = 1.05 × (actual distance) + 0.12 × (contact) × (actual distance) 
The standard errors of the estimated coefficients for Model 3 are 0.357 and 0.032, respectively. 


(d) Using Model 3, sketch the estimated regression model for contact wearers and the estimated regression 
model for noncontact wearers on the grid below. 





[Grid with the student’s sketched lines. Horizontal axis: Actual Distance (feet), 1 to 10. Vertical axis: Perceived Distance (feet).] 


(e) In the context of this study, provide an interpretation of the estimated coefficients for Model 3. 


Contact wearers overestimate the distance more. 


model: ŷ = (actual distance) + 0.05 × (actual distance) 
+ 0.12 × (contact) × (actual distance) 


Everyone overestimates by 5% of the actual distance on average; 
contact wearers overestimate by an additional 12% of the actual 
distance. 
Question 6 
(a) Provide an interpretation in context for the estimated slope in Model 1. 
The slope in Model 1 [illegible] the perceived distance [illegible] a factor of 1.080 [illegible] 
(b) Explain why the researcher might prefer Model 2 to Model 1 in this context. 

Model 2 might be [illegible] because the y-intercept is 0, suggesting 
that if two [illegible] have a distance of 0 ft. between them, the respondent 
will most [illegible] a distance of 0 ft. Model 1 gives a perceived 
distance of 0.238 ft. for a 0 ft. difference. 


(c) Using Model 2, test the researcher’s hypothesis that in dim light participants overestimate the distance, with 
the overestimate increasing as the actual distance increases. (Assume appropriate conditions for inference 
are met.) 


(1) [illegible] the slope of the linear model (2), [illegible] 
the actual distance between 2 objects in a dimly lit room. 


(2) I will use a linear regression t-test for β. I assume that [illegible] perceived 
distance (y) are independent and that [illegible] uniform standard 
[illegible] conditions are not met, I will proceed with caution. 

(3) P(b ≥ 1.102) = P(t ≥ 1.102 / 0.393) = P(t ≥ 2.8041) = [illegible] 


(4) Because [illegible] at the α = 0.05 level, [illegible] 
the perceived distance between 2 objects in dim light [illegible] 
increases as the actual distance increases. 


H₀: β = 0 There is no correlation between actual [illegible] 

Hₐ: β > 0 The slope is greater than 0 and the perceived distance 
increases as the actual distance increases. 
6B3 


(d) Using Model 3, sketch the estimated regression model for contact wearers and the estimated regression 
model for noncontact wearers on the grid below. 


(e) In the context of this study, provide an interpretation of the estimated coefficients for Model 3. 
Question 6 


(a) Provide an interpretation in context for the estimated slope in Model 1. 
(b) Explain why the researcher might prefer Model 2 to Model 1 in this context. 
The researcher might prefer Model 2 because the y-intercept 
is 0. It contains only one estimated value and 
therefore has less variability. The subject doesn’t start 
off [illegible] automatically [illegible] when the researcher 
computes the expected values when the actual distance is zero. 


(c) Using Model 2, test the researcher’s hypothesis that in dim light participants overestimate the distance, with 
the overestimate increasing as the actual distance increases. (Assume appropriate conditions for inference are met.) 

Linear regression t-test for the slope between the actual distance between 
objects and the perceived distance in dim light 

H₀: β = 1 There is no difference between the actual distance and 
the perceived distance in dim light. 

Hₐ: β > 1 As the actual distance between the objects 
increases, the distance perceived by the participant 
increases more. 

b = 1.102, SE = 0.393 (Model 2 standard error, given value) 
α = 0.05 

t = (1.102 − 1) / 0.393 = [illegible] 
t-distribution: P(t > [illegible]) = [illegible] 
Fail to reject H₀. 
There isn’t sufficient evidence to reject H₀. 

t* = [illegible], P(t > [illegible]) = 0.05 

[illegible] the slope between the actual distance and 
perceived distance is equal to 1 [illegible] 
were true, we would [illegible] results 
[illegible] of the time. [illegible] 


©2007 The College Board. All rights reserved. 
Visit apcentral.collegeboard.com (for AP professionals) and www.collegeboard.com/apstudents (for students and parents). 


6C3 


(d) Using Model 3, sketch the estimated regression model for contact wearers and the estimated regression 
model for noncontact wearers on the grid below. 


(e) In the context of this study, provide an interpretation of the estimated coefficients for Model 3. 
Question 6 
Overview 


This question was designed to evaluate a student’s ability to make inferences for simple linear regression models. 
Interpreting model parameters and comparing and contrasting different models are important skills that are also 
being assessed. Finally, a multiple regression model with a special variable—an indicator variable—is introduced 
to investigate whether the relationship between the predictor and response variable differs for two different groups 
of people. Students are asked to sketch the estimated line for both groups and interpret the estimated parameters in 
the multiple regression model. 


Sample: 6A 
Score: 4 


This outstanding response completely, concisely, and correctly answers all parts of this investigative task. Three 
insights were expected and are made: subjects are unlikely to perceive a distance greater than zero when the 
distance is zero, the researcher suspects that β > 1, and the effect of the indicator variable in Model 3 is to produce 
two lines. Part (a) gives a good interpretation of the estimated slope, making it clear that for every additional foot of 
actual distance, we estimate (or predict) that the subject will perceive 1.08 additional feet. Part (b) gives a correct 
explanation, but the term “directly proportional” would have been better than “directly linear.” In part (c) the 
required parts of a test of significance are included: a statement of hypotheses, conditions, correct calculations, 
mechanics, and a conclusion that is based on the results of the computations. (In this particular test of significance, 
students are told that it is not necessary to state and check conditions.) The conclusion contains a good explanation 
of p-value, including the necessary qualifier that the p-value is computed assuming that the null hypothesis is true. 
The term “unbiased” in the conclusion to part (c) is used correctly. Using 38 degrees of freedom rather than 39 is 
considered a minor error. Parts (d) and (e) are concise and correct. This essay was complete in all essential ideas. 


Sample: 6B 
Score: 3 


This response demonstrates a general understanding that subjects are unlikely to perceive a distance greater than 
zero when the distance is zero and that the effect of the indicator variable in Model 3 is to produce two lines. 
However, in part (c) the response makes the common error of testing the null hypothesis of β = 0. (A response 
that tested β = 0 in part (c) would not receive a score of 4.) Further, in the interpretation in part (a), 1.08 is referred 
to as a “factor.” It is unclear if the response implies that the distance a person perceives can be found by multiplying 
the actual distance by 1.08, thereby ignoring the y-intercept and the uncertainty in the estimate. The difficulty about 
uncertainty also occurs in part (e), but the response was not penalized again for this. In part (c), while the hypotheses 
are incorrect, the calculations and conclusion are appropriate for the hypotheses stated. Using 38 degrees of freedom 
rather than 39 is considered a minor error. Parts (b), (d), and (e) were scored as correct. This is a substantial response. 


Sample: 6C 
Score: 2 


The response demonstrates an understanding that the null hypothesis in part (c) should be β = 1 but does not 
demonstrate an understanding that Model 2 might be preferred nor does it demonstrate an understanding that the 
model with the indicator variable produces two linear equations. Further, in part (a) the statement “the subject will 
estimate” is too deterministic because an estimated slope of 1.080 does not imply that every person overestimates by 
the same amount or even that every person overestimates. The response would have been scored correct if the word 
“about” or “approximately” were used; for example, “the perceived distance between the objects increased by 
approximately 1.080 feet.” The same difficulty occurs in part (e), but the response was not penalized again for this. 
Part (b) was scored as incorrect. In fact, the estimated standard error for Model 2 is quite large compared to the two 
estimated standard errors for Model 1. Part (c) is very well done, with a good interpretation of the p-value. The 
wording of the null hypothesis implies that the null hypothesis is that each subject will predict correctly. A better 
wording would be, “For every increase of 1 foot in actual distance, on average people perceive an increase of 1 
foot.” The single, unlabeled line in part (d) was scored as incorrect. Part (e) is a nice explanation of the estimated 
coefficients. The essay clearly illustrates a developing knowledge. 